Enriching Knowledge Domain Visualizations: Analysis of a Record Linkage and Information Fusion Approach to Citation Data

Author

  • Marie Synnestvedt
Abstract

This article studies the use of a data-preparation-for-data-mining methodology to prepare biomedical citation data for visualization. Deterministic record linkage models were compared with probabilistic record linkage in a setting where the truth is known through gold standard (truth) datasets. The linkages were evaluated on data from the Web of Science (WOS) and Medline citation databases. Sensitivity, specificity, and overall performance of the record linkage models were compared empirically using ROC analysis. Data quality and visualization metrics are presented for datasets prepared with and without probabilistic record linkage and information fusion of Medline abstracts and MeSH terms into WOS citation records. The major contribution of this work is a novel model of record linkage for biomedical citation databases, developed with the objective of improving and enriching biomedical knowledge domain visualizations.
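The comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the article's model: it assumes a Fellegi-Sunter style match weight as the probabilistic method, and the field names, the m/u probabilities, the threshold, and the example records are all hypothetical values chosen for illustration.

```python
import math

# Hypothetical per-field parameters (assumptions, not taken from the article):
#   m = P(field agrees | the two records describe the same paper)
#   u = P(field agrees | the two records describe different papers)
FIELD_PARAMS = {
    "title": (0.95, 0.01),
    "year": (0.98, 0.10),
    "first_page": (0.90, 0.02),
}


def normalize(value):
    """Lowercase and collapse whitespace so trivial formatting differences agree."""
    return " ".join(str(value).lower().split())


def agreement(rec_a, rec_b):
    """Return, per field, whether the two records agree after normalization."""
    return {f: normalize(rec_a[f]) == normalize(rec_b[f]) for f in FIELD_PARAMS}


def deterministic_match(rec_a, rec_b):
    """Deterministic rule: link only if every compared field agrees exactly."""
    return all(agreement(rec_a, rec_b).values())


def match_weight(rec_a, rec_b):
    """Sum of log2 likelihood ratios over fields (Fellegi-Sunter style weight)."""
    total = 0.0
    for field, agrees in agreement(rec_a, rec_b).items():
        m, u = FIELD_PARAMS[field]
        total += math.log2(m / u) if agrees else math.log2((1 - m) / (1 - u))
    return total


def probabilistic_match(rec_a, rec_b, threshold=5.0):
    """Link when the total weight reaches a (hypothetical) threshold."""
    return match_weight(rec_a, rec_b) >= threshold


def sensitivity_specificity(predicted, truth):
    """Compare predicted links with gold standard labels."""
    tp = sum(1 for p, t in zip(predicted, truth) if p and t)
    tn = sum(1 for p, t in zip(predicted, truth) if not p and not t)
    fn = sum(1 for p, t in zip(predicted, truth) if not p and t)
    fp = sum(1 for p, t in zip(predicted, truth) if p and not t)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Made-up records standing in for WOS and Medline entries.
wos_1 = {"title": "Record linkage of citations", "year": 2005, "first_page": "101"}
med_1 = {"title": "Record Linkage of  Citations", "year": 2005, "first_page": "e101"}
wos_2 = {"title": "Visualizing knowledge domains", "year": 2003, "first_page": "7"}
med_2 = {"title": "Visualizing knowledge domains", "year": 2003, "first_page": "7"}

pairs = [(wos_1, med_1), (wos_2, med_2), (wos_1, med_2)]
truth = [True, True, False]  # gold standard labels for the three pairs

det_pred = [deterministic_match(a, b) for a, b in pairs]
prob_pred = [probabilistic_match(a, b) for a, b in pairs]
det_scores = sensitivity_specificity(det_pred, truth)
prob_scores = sensitivity_specificity(prob_pred, truth)
```

In this toy setup the first pair differs only in how the page number is written, so the exact-match rule rejects it while the weighted score still accepts it. Sweeping `threshold` over a range of values and recording sensitivity and specificity at each one would yield the points of an ROC curve for the probabilistic model.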

Similar resources

Data Preparation for Biomedical Knowledge Domain Visualization: A Probabilistic Record Linkage and Information Fusion Approach to Citation Data

Marie B Synnestvedt, Xia Lin Ph.D. This thesis presents a methodology of data preparation with probabilistic record linkage and information fusion for improving and enriching information visualizations of biomedical citation data. The problem of record l...

Probabilistic Linkage of Persian Record with Missing Data

Extended Abstract. When the comprehensive information about a topic is scattered among two or more data sets, using only one of those data sets loses the information available in the others. Hence, it is necessary to integrate the scattered information into a single comprehensive data set. On the other hand, sometimes we are interested in identifying duplicates within a data set. The i...

Valuing Indirect Citations in Citation Networks using Data Fusion

Any scientific activity requires awareness of previous related activities. Citation networks are networks in which each document is treated as a link in a chain with the documents before and after it, and the documents with the highest number of citations are considered the most influential ones in a domain. Most of the methods introduced so far use direct citations for valuing the documents. One ...

Selection of Core Dental Journals using Citation Analysis of Scientific Journals in the Field of Dentistry

  Objective: Citation analysis is a bibliometric method used for the calculation of the mean number of citations and detection of highly cited references, frequency distribution of citations in different languages, update rate, efficacy of resources and number of core journals of a scientific domain. This process elucidates the information needs of researchers in a specific realm of science and...

The thematic and citation landscape of Data and Knowledge Engineering

The thematic and citation structures of Data and Knowledge Engineering (DKE) (1985–2007) are identified based on text analysis and citation analysis of the bibliographic records of full papers published in the journal. Temporal patterns are identified by detecting abrupt increases of frequencies of noun phrases extracted from titles and abstracts of DKE papers over time. Conceptual structures o...

Journal:
  • AMIA ... Annual Symposium proceedings. AMIA Symposium

Publication date: 2007